Emoji Could be Joined
Date: 2023-03-04
Apparently, this “👨‍👩‍👦” or Family: Man, Woman, Boy emoji consists of multiple emojis that are combined or joined.
$ node
Welcome to Node.js v19.7.0.
Type ".help" for more information.
> [..."👨‍👩‍👦"]
[ '👨', '‍', '👩', '‍', '👦' ]
or the other way around:
> `👨\u{200D}👩\u{200D}👦`
'👨‍👩‍👦'
Note: You might or might not see the family emoji as a single glyph when you run the above code in your terminal. To render this emoji correctly, you need a font that supports emoji (like the Google Noto font) and a terminal that can render Unicode characters spanning multiple cells (like kitty).
Huh. All this time I thought that every Unicode character (including emojis) was represented by its own unique code (which is true, but I pictured it more like ASCII characters). Today I learned that they can be joined or concatenated.
Take a look back at the example above. The emoji “👨‍👩‍👦” consists of 5 Unicode code points, 3 of which are the Man emoji (👨), the Woman emoji (👩), and the Boy emoji (👦). When you join those 3 emojis into one, as in the second code snippet, you get the family emoji back.
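To make the "5 code points, 1 glyph" idea concrete, here is a minimal sketch that counts the same string in three ways. The helper name `toCodePoints` is made up for this note, and the emoji is written with escapes so the result does not depend on your font or terminal. It assumes a Node.js version that ships `Intl.Segmenter`.

```javascript
// Hypothetical helper: list a string's code points in U+XXXX notation.
function toCodePoints(str) {
  // Spreading a string iterates by code point, not by UTF-16 code unit.
  return [...str].map(
    (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Family: Man, Woman, Boy, built from escapes instead of a pasted emoji.
const family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F466}";

// Count user-perceived characters (grapheme clusters).
const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
const graphemeCount = [...segmenter.segment(family)].length;

console.log(toCodePoints(family));
// [ 'U+1F468', 'U+200D', 'U+1F469', 'U+200D', 'U+1F466' ]
console.log(family.length); // 8 (UTF-16 code units)
console.log(graphemeCount); // 1 (one user-perceived character)
```

So `.length` says 8, the spread operator says 5, and the grapheme segmenter says 1, all for the same emoji.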
So what is that empty char ('‍') then? What is this \u{200D} (or 0x200D) thing?
Both of those are the ZWJ, or Zero Width Joiner (U+200D). This invisible character acts as glue that joins an emoji sequence into another emoji. AFAICT, in the emoji context the ZWJ's sole purpose is to join a sequence of emoji, and it is not required when you combine non-emoji Unicode characters like this one: é (Latin Small Letter E followed by Combining Acute Accent)
> [...'é']
[ 'e', '́' ]
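As a small aside, here is a sketch of how this combining sequence differs from a ZWJ sequence. This particular pair has a precomposed form, so Unicode normalization (NFC) folds it into a single code point, while the family emoji is left untouched. The variable names are only for illustration.

```javascript
// "e" followed by Combining Acute Accent (U+0301): two code points, no ZWJ involved.
const decomposed = "e\u{0301}";
// NFC normalization composes the pair into Latin Small Letter E with Acute (U+00E9).
const composed = decomposed.normalize("NFC");

console.log([...decomposed].length); // 2
console.log([...composed].length); // 1
console.log(composed === "\u{00E9}"); // true

// A ZWJ emoji sequence has no precomposed form, so NFC leaves it as is.
const family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F466}";
console.log(family.normalize("NFC") === family); // true
```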
Right then. Let’s play around with combining some emojis!
Suppose you have this “🧔” or Person with Beard emoji and you want to create the male version of it (yes, this “Person with Beard” emoji is supposed to be gender neutral). I can guess what some of you might be thinking right now: “Let’s join this 🧔 emoji with the 👨 or Man emoji!”
> `🧔\u{200D}👨`
'🧔‍👨'
And… it didn’t work? It just prints both emoji characters side by side, not the single 🧔‍♂️ or Man with a Beard emoji we expected to see.
Okay, let’s cheat this time.
> [...'🧔‍♂️']
[ '🧔', '‍', '♂', '️' ]
Uh oh! So you’re not joining it with the “Man” emoji; instead you use the “♂️” or Male Sign emoji. But why is there another ZWJ at the end?
> for (const c of "🧔‍♂️") { console.log(`0x${c.codePointAt(0).toString(16)}`) }
0x1f9d4
0x200d
0x2642
0xfe0f
It turns out that the last character is not a ZWJ! It is Variation Selector-16 (U+FE0F). This other invisible character specifies that the preceding character should be displayed in emoji presentation instead of its default text presentation.
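Here is a minimal sketch of that breakdown, rebuilding the emoji from the four code points printed above. The constant names are mine, not anything official. Whether the bare Male Sign actually looks different from the one followed by Variation Selector-16 depends on your font and terminal.

```javascript
const ZWJ = "\u{200D}"; // Zero Width Joiner
const VS16 = "\u{FE0F}"; // Variation Selector-16
const MALE_SIGN = "\u{2642}";

// Rebuild "Man with a Beard" from the code points logged by the loop above.
const manBeard = String.fromCodePoint(0x1f9d4, 0x200d, 0x2642, 0xfe0f);

console.log(manBeard === `\u{1F9D4}${ZWJ}${MALE_SIGN}${VS16}`); // true
console.log(manBeard.endsWith(VS16)); // true

// The selector is a separate code point that follows the character it affects.
console.log([...MALE_SIGN].length); // 1
console.log([...(MALE_SIGN + VS16)].length); // 2
```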
We could then use the above information to create… the female version of the “Person with Beard” emoji.
> `🧔\u{200D}♀\u{FE0F}`
'🧔‍♀️'
Want to add a skin tone? Follow the “Person with Beard” emoji with the skin tone of your choice, without using a ZWJ. Let’s pick the “U+1F3FB” code point, or Light Skin Tone, as an example.
> `🧔\u{1F3FB}\u{200D}♀\u{FE0F}`
'🧔🏻‍♀️'
Why use a code point this time?
Because we aren’t using a ZWJ to join the skin tone, the browser reads the “Person with Beard” emoji followed by the “Light Skin Tone” emoji as a valid sequence and renders it as a single glyph, like this:
> `🧔🏻\u{200D}♀\u{FE0F}`
'🧔🏻‍♀️'
Using the code point shows clearly that we are appending the skin tone emoji (or code point) right after the person emoji.
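The same step can be sketched as a tiny function. `withSkinTone` is a hypothetical helper written for this note, and it makes a big assumption: the input is non-empty and its first code point is the person to be toned, which holds for the beard emoji here but not for every emoji sequence.

```javascript
const LIGHT_SKIN_TONE = "\u{1F3FB}";

// Hypothetical helper: insert a skin tone modifier right after the first code point.
// Assumes a non-empty string whose first code point is the person emoji.
function withSkinTone(emoji, tone) {
  const [base, ...rest] = [...emoji];
  return base + tone + rest.join("");
}

// Person with Beard + ZWJ + Female Sign + VS16, written with escapes.
const womanBeard = "\u{1F9D4}\u{200D}\u{2640}\u{FE0F}";
const toned = withSkinTone(womanBeard, LIGHT_SKIN_TONE);

console.log(toned === "\u{1F9D4}\u{1F3FB}\u{200D}\u{2640}\u{FE0F}"); // true
console.log([...toned].map((ch) => ch.codePointAt(0).toString(16)));
// [ '1f9d4', '1f3fb', '200d', '2640', 'fe0f' ]
```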
It would be fun to play a “Guess the Codepoint” game, I think?
Let’s start with the easiest one.
Guess the Codepoint!
- ❤️‍🔥 (Heart on Fire)?
- 😮‍💨 (Face Exhaling)?
  It is Face with Open Mouth and Dashing Away.
  > [...'😮‍💨']
  [ '😮', '‍', '💨' ]
  You need to have your mouth open to be able to exhale, right?
- 😵‍💫 (Face with Spiral Eyes)?
  It is Face with Crossed-Out Eyes and Dizzy.
  > [...'😵‍💫']
  [ '😵', '‍', '💫' ]
  I don’t know the symbolic difference between “x” eyes and spiral eyes, but both are used to represent dizziness (in this emoji context).
- 👨‍🍼 (Man Feeding Baby)?
  It is Man and Baby Bottle. (Huh, where is the baby??)
  > [...'👨‍🍼']
  [ '👨', '‍', '🍼' ]
  Also, hope we can get Man breast-feeding soon!
- 🐻‍❄️ (Polar Bear)?
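If you want to check your guesses, here is a rough sketch that splits an emoji at each ZWJ and prints the code points of every part. `splitOnZwj` is a made-up helper name, and the samples are emojis already decoded earlier in this post, so no spoilers.

```javascript
const ZWJ = "\u{200D}";

// Hypothetical helper: split a sequence at each ZWJ and list each part's code points.
function splitOnZwj(emoji) {
  return emoji
    .split(ZWJ)
    .map((part) =>
      [...part].map((ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase())
    );
}

// Face Exhaling: Face with Open Mouth + ZWJ + Dashing Away.
const faceExhaling = "\u{1F62E}\u{200D}\u{1F4A8}";
console.log(splitOnZwj(faceExhaling));
// [ [ 'U+1F62E' ], [ 'U+1F4A8' ] ]

// Man with a Beard: the second part keeps its Variation Selector-16.
const manBeard = "\u{1F9D4}\u{200D}\u{2642}\u{FE0F}";
console.log(splitOnZwj(manBeard));
// [ [ 'U+1F9D4' ], [ 'U+2642', 'U+FE0F' ] ]
```

Paste any emoji from the list into `splitOnZwj` and look up the resulting code points to see which emojis it is made of.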
Ref: https://fasterthanli.me/articles/the-bottom-emoji-breaks-rust-analyzer